The Use of Twitter to Track Levels of Disease Activity and Public Concern in the U.S. during the Influenza A H1N1 Pandemic
Twitter is a free social networking and micro-blogging service that enables its
millions of users to send and read each other's “tweets,” or
short, 140-character messages. The service has more than 190 million registered
users and processes about 55 million tweets per day. Useful information about
news and geopolitical events lies embedded in the Twitter stream, which
embodies, in the aggregate, Twitter users' perspectives and reactions to
current events. By virtue of sheer volume, content embedded in the Twitter
stream may be useful for tracking or even forecasting behavior if it can be
extracted in an efficient manner. In this study, we examine the use of
information embedded in the Twitter stream to (1) track rapidly-evolving public
sentiment with respect to H1N1 or swine flu, and (2) track and measure actual
disease activity. We also show that Twitter can be used as a measure of public
interest or concern about health-related events. Our results show that estimates
of influenza-like illness derived from Twitter chatter accurately track reported
disease levels.
GET WELL: an automated surveillance system for gaining new epidemiological knowledge
Background: The assumption behind the presented work is that the information people search for on the internet reflects the disease status in society. With access to this source of information, epidemiologists can gain a valuable complement to traditional surveillance and potentially obtain new and timely epidemiological insights. For this purpose, the Swedish Institute for Infectious Disease Control collaborates with a medical web site in Sweden. Methods: We built an application consisting of two conceptual parts. One part allows trends, based on user-specified requests, to be extracted from anonymous web query data from a Swedish medical web site. The second part permits tailored analyses of particular diseases, in which more complex statistical methods are applied to the data. To evaluate the epidemiological relevance of the output, we compared Google search data and search data from the medical web site. Results: In the paper, we give concrete examples of the output from the web query-based system. We also present results from the comparison between data from the search engine Google and search data from the national medical web site. Conclusions: The application is in regular use at the Swedish Institute for Infectious Disease Control. A system based on web queries is flexible in that it can be adapted to any disease; it provides information on individuals other than those who seek medical care; and the data do not suffer from reporting delays. Although Google data are based on a substantially larger search volume, search patterns obtained from the medical web site may still convey more information from an epidemiological perspective. Furthermore, we see advantages in having full access to the raw data.
Measuring the impact of health policies using Internet search patterns: the case of abortion
Background: Internet search patterns have emerged as a novel data source for monitoring infectious disease trends. We propose that these data can also be used more broadly to study the impact of health policies across different regions in a more efficient and timely manner. Methods: As a test use case, we studied the relationships between abortion-related search volume, local abortion rates, and local abortion policies available for study. Results: Our initial integrative analysis found that, both in the US and internationally, the volume of Internet searches for abortion is inversely proportional to local abortion rates and directly proportional to local restrictions on abortion. Conclusion: These findings are consistent with published evidence that local restrictions on abortion lead individuals to seek abortion services outside of their area. Further validation of these methods has the potential to produce a timely, complementary data source for studying the effects of health policies.
Prediction of Dengue Incidence Using Search Query Surveillance
Improvements in surveillance, prediction of outbreaks and the monitoring of the epidemiology of dengue virus in countries with underdeveloped surveillance systems are of great importance to ministries of health and other public health decision makers, who are often constrained by budget or manpower. Google Flu Trends has proven successful in providing an early warning system for outbreaks of influenza weeks before case data are reported. We believe that there is greater potential for this technique for dengue, as the incidence of this pathogen can vary by a factor of ten in some settings, making prediction all the more important in public health planning. In this paper, we demonstrate the utility of Google search terms in predicting dengue incidence in Singapore and Bangkok, Thailand, using several regression techniques. Incidence data were provided by the Singapore Ministry of Health and the Thailand Bureau of Epidemiology. We find that our models predict incident cases well (correlation greater than 0.8) and periods of high incidence equally well (AUC greater than 0.95). All data and analysis code used in our study are available free online and can be adapted to other settings.
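The two evaluation measures named in this abstract, correlation between predicted and reported incidence and AUC for flagging high-incidence periods, can be illustrated with a minimal sketch. The numbers below are invented toy values, not data from the study. The threshold that defines a "high incidence" week and the function names `pearson` and `auc` are assumptions made for illustration only.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive scores higher
    (ties count as half)."""
    pos = [s for s, is_high in zip(scores, labels) if is_high]
    neg = [s for s, is_high in zip(scores, labels) if not is_high]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy weekly case counts (invented for illustration).
reported = [12, 15, 30, 55, 80, 60, 25, 14]
predicted = [10, 18, 28, 50, 70, 65, 30, 12]

# Assumption: a week counts as "high incidence" if reported cases exceed 40.
HIGH_THRESHOLD = 40
is_high = [cases > HIGH_THRESHOLD for cases in reported]

print("correlation:", round(pearson(predicted, reported), 3))
print("AUC:", auc(predicted, is_high))
```

The pairwise AUC is quadratic in the number of weeks, which is acceptable for short surveillance series; a rank-based formulation would scale better for long ones.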
A multi-level geographical study of Italian political elections from Twitter Data
In this paper we present an analysis of the behavior of Italian Twitter users during national political elections. We monitor the volumes of the tweets related to the leaders of the various political parties and we compare them to the election results. Furthermore, we study the topics that are associated with the co-occurrence of two politicians in the same tweet. We cannot conclude, from a simple statistical analysis of tweet volume and its evolution over time, that it is possible to precisely predict the election outcome (or at least not in our case study, which was characterized by a “too-close-to-call” scenario). On the other hand, we found that the volume of tweets and its change over time provide a very good proxy of the final results. We present this analysis both at a national level and at smaller levels, ranging from the regions composing the country to macro-areas (North, Center, South).
Simulation of an SEIR infectious disease model on the dynamic contact network of conference attendees
The spread of infectious diseases crucially depends on the pattern of
contacts among individuals. Knowledge of these patterns is thus essential to
inform models and computational efforts. Few empirical studies are however
available that provide estimates of the number and duration of contacts among
social groups. Moreover, their space and time resolution are limited, so that
data is not explicit at the person-to-person level, and the dynamical aspect of
the contacts is disregarded. Here, we want to assess the role of data-driven
dynamic contact patterns among individuals, and in particular of their temporal
aspects, in shaping the spread of a simulated epidemic in the population.
We consider high resolution data of face-to-face interactions between the
attendees of a conference, obtained from the deployment of an infrastructure
based on Radio Frequency Identification (RFID) devices that assess mutual
face-to-face proximity. The spread of epidemics along these interactions is
simulated through an SEIR model, using both the dynamical network of contacts
defined by the collected data, and two aggregated versions of such network, in
order to assess the role of the data temporal aspects.
We show that, on the timescales considered, an aggregated network taking into
account the daily duration of contacts is a good approximation to the full
resolution network, whereas a homogeneous representation which retains only the
topology of the contact network fails in reproducing the size of the epidemic.
These results have important implications in understanding the level of
detail needed to correctly inform computational models for the study and
management of real epidemics.
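The comparison described in this abstract, an SEIR process run on a time-resolved contact list versus on a time-aggregated weighted network, can be sketched as follows. This is a minimal discrete-time illustration under stated assumptions, not the authors' implementation: the contact data are randomly generated, the parameter values (`beta`, `sigma`, `gamma`) are arbitrary, and the helper names `run_seir`, `dynamic_view` and `aggregated_view` are hypothetical.

```python
import random
from collections import defaultdict

S, E, I, R = "S", "E", "I", "R"


def run_seir(n_nodes, contacts_at, n_steps, beta, sigma, gamma, seed_node, rng):
    """Discrete-time stochastic SEIR.

    contacts_at(t) returns (i, j, weight) tuples active at step t; each such
    contact transmits with probability beta * weight. Returns the final
    epidemic size (number of nodes that ever left the susceptible state).
    """
    state = [S] * n_nodes
    state[seed_node] = I
    for t in range(n_steps):
        newly_exposed = set()
        for i, j, weight in contacts_at(t):
            for src, dst in ((i, j), (j, i)):
                if state[src] == I and state[dst] == S:
                    if rng.random() < min(1.0, beta * weight):
                        newly_exposed.add(dst)
        # Progress E -> I and I -> R for nodes already infected at step start.
        for node in range(n_nodes):
            if state[node] == E and rng.random() < sigma:
                state[node] = I
            elif state[node] == I and rng.random() < gamma:
                state[node] = R
        for node in newly_exposed:
            state[node] = E
    return sum(1 for s in state if s != S)


def dynamic_view(contacts):
    """Full-resolution view: each contact is active only at its own step."""
    by_step = defaultdict(list)
    for t, i, j in contacts:
        by_step[t].append((i, j, 1.0))
    return lambda t: by_step.get(t, [])


def aggregated_view(contacts, n_steps):
    """Aggregated view: every edge is always active, weighted by the
    fraction of steps in which the pair was in contact."""
    totals = defaultdict(int)
    for _, i, j in contacts:
        totals[(min(i, j), max(i, j))] += 1
    edges = [(i, j, count / n_steps) for (i, j), count in totals.items()]
    return lambda t: edges


# Toy contact list (invented): 5 random pairs per step among 30 people.
N_NODES, N_STEPS = 30, 200
data_rng = random.Random(1)
contacts = []
for t in range(N_STEPS):
    for _ in range(5):
        i, j = data_rng.sample(range(N_NODES), 2)
        contacts.append((t, i, j))

for name, view in (("dynamic", dynamic_view(contacts)),
                   ("aggregated", aggregated_view(contacts, N_STEPS))):
    sizes = [run_seir(N_NODES, view, N_STEPS, beta=0.3, sigma=0.2, gamma=0.05,
                      seed_node=0, rng=random.Random(run)) for run in range(20)]
    print(name, "mean final size:", sum(sizes) / len(sizes))
```

Because the toy contacts are drawn uniformly at random, the two views are expected to behave similarly here. Real contact data with bursty or heterogeneous timing is where differences between representations would be expected to appear.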
Web Queries as a Source for Syndromic Surveillance
In the field of syndromic surveillance, various sources are exploited for outbreak detection, monitoring and prediction. This paper describes a study on queries submitted to a medical web site, with influenza as a case study. The hypothesis of the work was that queries on influenza and influenza-like illness would provide a basis for estimating the timing of the peak and the intensity of the yearly influenza outbreaks that would be as good as the existing laboratory and sentinel surveillance. We calculated the occurrence of various queries related to influenza from search logs submitted to a Swedish medical web site for two influenza seasons. These figures were subsequently used to generate two models, one to estimate the number of laboratory-verified influenza cases and one to estimate the proportion of patients with influenza-like illness reported by selected General Practitioners in Sweden. We applied an approach designed for highly correlated data, partial least squares regression. In our work, we found that certain web queries on influenza follow the same pattern as that obtained by the two other surveillance systems for influenza epidemics, and that they have equal power for the estimation of the influenza burden in society. Web queries give unique access to ill individuals who are not (yet) seeking care. This paper shows the potential of web queries as an accurate, cheap and labour extensive source for syndromic surveillance.
Using Web Search Query Data to Monitor Dengue Epidemics: A New Model for Neglected Tropical Disease Surveillance
A variety of obstacles, including bureaucracy and lack of resources, delay the detection and reporting of dengue in many countries where the disease is a major public health threat. Surveillance efforts have turned to modern data sources such as Internet usage data. People often seek health-related information online, and it has been found that the frequency of, for example, influenza-related web searches as a whole rises as the number of people sick with influenza rises. Tools have been developed to help track influenza epidemics by finding patterns in certain web search activity. However, few have evaluated whether this approach would also be effective for other diseases, especially those that affect many people, that have severe consequences, or for which there is no vaccine. In this study, we found that aggregated, anonymized Google search query data were also capable of tracking dengue activity in Bolivia, Brazil, India, Indonesia and Singapore. Whereas traditional dengue data from official sources are often not available until after a long delay, web search query data are available for analysis within a day. Therefore, because they could potentially provide earlier warnings, these data represent a valuable complement to traditional dengue surveillance.
A New Approach to Monitoring Dengue Activity
Discusses informal surveillance tools for monitoring dengue activity, such as ProMED, GPHIN, HealthMap and BioCaster.
Assessing the impact of a health intervention via user-generated Internet content
Assessing the effect of a health-oriented intervention by traditional epidemiological methods is commonly based only on population segments that use healthcare services. Here we introduce a complementary framework for evaluating the impact of a targeted intervention, such as a vaccination campaign against an infectious disease, through a statistical analysis of user-generated content submitted on web platforms. Using supervised learning, we derive a nonlinear regression model for estimating the prevalence of a health event in a population from Internet data. This model is applied to identify control location groups that correlate historically with the areas where a specific intervention campaign has taken place. We then determine the impact of the intervention by inferring a projection of the disease rates that could have emerged in the absence of a campaign. Our case study focuses on the influenza vaccination program that was launched in England during the 2013/14 season, and our observations consist of millions of geo-located search queries to the Bing search engine and posts on Twitter. The impact estimates derived from the application of the proposed statistical framework support conventional assessments of the campaign.